Abstract We investigate here a new version of the Calculus of Inductive Constructions
(CIC), on which the proof assistant Coq is based: the Calculus of Congruent
Inductive Constructions, which truly extends CIC by building in arbitrary first-order
decision procedures. Deduction remains the responsibility of the CIC kernel, while
computation is outsourced to dedicated first-order decision procedures that can be taken
off the shelf, provided they deliver a proof certificate. The soundness of the whole
system becomes an incremental property following from the soundness of the
certificate checkers and that of the kernel. A detailed example shows that the resulting
style of proofs becomes closer to that of the working mathematician.

1 Introduction 

Proof assistants based on the Curry-Howard isomorphism, such as Coq [9], allow one to
build the proof of a given proposition by applying appropriate proof tactics available
from existing libraries or that can otherwise be developed for achieving a specific
task. These tactics generate a proof term that can be checked with respect to the rules
of logic. The proof-checker, also called the kernel of the proof assistant, implements
the deduction rules of the logic on top of a term manipulation layer. In this model,
the mathematical correctness of a proof development relies entirely on the kernel.
Trusting the kernel is therefore vital.

The (intuitionist) logic on which Coq is based is the Calculus of Constructions
(CC) of Coquand and Huet [10], an impredicative type theory incorporating
polymorphism, dependent types and type constructors. Unlike logics without dependent
types, CC enjoys a powerful type-checking rule, called conversion, which
incorporates computations within deductions, making decidability of type-checking a
non-trivial property of the calculus.

In CC, computation reduces to (pure) functional evaluation in the underlying
lambda calculus. The notion of computation is richer in the Calculus of Inductive
Constructions of Coquand and Paulin (CIC), obtained from CC by adding inductive
types and the corresponding rules for higher-order primitive recursion [11]. The
recent versions of Coq are based on a slight generalization of this calculus [15]. Still,
such a simple function as reverse of a dependent list cannot be defined in CIC as one
would expect, because reverse(l :: l') and reverse(l') :: reverse(l), assuming :: is
list concatenation, have non-convertible types list(n + m) and list(m + n), assuming
reverse(l) has the same type as its argument l. This is so because the usual
definition of + by induction on one of its arguments does not reduce the proof of
m + n = n + m to a computation.
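The obstruction can be made concrete with a small sketch (Python, outside the calculus; the tuple encoding of terms and the name `normalize` are assumptions made for illustration only). Addition computes by recursion on its first argument, so a sum whose first argument is a variable is stuck, and m + n and n + m have distinct normal forms:

```python
# Symbolic Peano terms (hypothetical encoding): "0", a variable name such as "n",
# ("S", t) for successor, and ("+", t, u) for addition.

def normalize(t):
    """Normalize with the rules 0 + u -> u and (S t) + u -> S (t + u) only."""
    if isinstance(t, str):
        return t
    if t[0] == "S":
        return ("S", normalize(t[1]))
    _, left, right = t
    left, right = normalize(left), normalize(right)
    if left == "0":
        return right
    if isinstance(left, tuple) and left[0] == "S":
        return ("S", normalize(("+", left[1], right)))
    # Stuck: the first argument is a variable (or a stuck sum).
    return ("+", left, right)


closed = normalize(("+", ("S", "0"), "n"))   # computes to S n
stuck = normalize(("+", "n", ("S", "0")))    # no rule applies
```

Since `normalize(("+", "n", "m"))` and `normalize(("+", "m", "n"))` differ syntactically, a conversion rule based on computation alone cannot identify list(n + m) with list(m + n).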

We do believe that scaling up the proof development process requires being able 
to mimic the mathematician when replacing the proof of a proposition P by the 
proof of an equivalent proposition P' obtained from P thanks to possibly complex 
calculations in which easy steps are hidden away. It is our program to make this 
view a reality. 

One way to incorporate decision procedures into Coq is to develop a tactic and
then use a reflection technique: the decision procedure itself is proved correct, so
that the proof term being built need not be checked. But the soundness of the entire
mechanism cannot be guaranteed in general [12]. Further, this does not answer the
question of hiding easy steps away.

A first attempt towards our goal is the Calculus of Algebraic Constructions
(CAC), obtained by adding to CC user-defined computations as rewrite rules [5, 3].
Although conceptually quite powerful since CAC captures CIC [4], this paradigm
does not yet fulfill all needs. In particular, users need to hide away the easy steps
themselves, that is, by giving the necessary rewrite rules and by verifying that they
satisfy the assumptions of the general schema [5, 3].

The proof assistant PVS uses a potentially stronger paradigm than Coq by
combining its deduction mechanism with a notion of computation based on the powerful
Shostak's method for combining decision procedures [20], a framework dubbed
little proof engines by Shankar [19]. Indeed, the little engines of proof hide away
the easy computational steps, without any user assistance. Unfortunately,
proof-checking is not decidable in PVS. Further, since the little engines of proof involve
complex coding, as does Shostak's algorithm itself, one can only believe a PVS
proof, while one can check and trust a Coq proof.

Two steps in the direction of integrating decision procedures into CC are Stehr's
Open Calculus of Constructions (OCC) [18] and Oury's Extensional Calculus of
Constructions (ECC) [17]. Implemented in Maude, OCC allows for the use of an
arbitrary equational theory in conversion. ECC can be seen as a particular case of
OCC in which all provable equalities can be used in conversion, which can also
be achieved by adding the extensionality and Streicher's axioms to CC [22], hence
the name of this calculus. Unfortunately, strong normalization and decidability of
type checking are then lost, which shows that we should look for more restrictive
extensions.

In a preliminary work, we designed a new, quite restrictive framework, the
Calculus of Congruent Constructions (CCC), which incorporates the congruence closure
algorithm in CC's conversion [7], while preserving the good properties of CC,
including the decidability of type checking. In [6], we have described $CC_N$, in which
the decision procedure was Presburger arithmetic and strong elimination was ruled out.
The present work is a continuation of the latter.

Theoretical contribution. Our main theoretical contribution is the definition and
the meta-theoretical investigation of the Calculus of Congruent Inductive
Constructions (CCIC), which incorporates arbitrary first-order theories for which entailment
is decidable into deductions via an abstract conversion rule of the calculus. A major
technical innovation of this work lies in the computation mechanism: goals are sent
to the decision procedure together with the set of user hypotheses available from the
current context. Our main result shows that this extension of CIC does not
compromise its properties: confluence, strong normalization, coherence and decidability of
proof-checking are all preserved.

Unlike previous calculi, the difficulty with CCIC is not strong normalization, for
which we have reused the strong normalization proof of CAC [3]. A major
difficulty was a traditional step towards subject-reduction: compatibility of conversion
with products. Decidability of type checking required restricting conversions below
recursors [23].

Practical contribution. We give several examples showing the usefulness of this 
new calculus, in particular for using dependent types such as dependent lists, which 
has been an important weakness of Coq until now. Further studies are needed to 
explore other potential applications, to match inductive definition-by-case modulo 
theories of constructors-destructors, another very different weakness of Coq. A de- 
tailed example shows that the resulting style of proofs becomes closer to that of the 
working mathematician. 

Methodological contribution. The safety of proof assistants is based on their 
kernel. In the early days of Coq, the safety of its kernel relied on its small size and 
its clear structure reflecting the inference rules of the intuitionist type theory, CC, 
on which it was based. The slogan was that of a readable kernel. Moving later to
CIC eased specification tasks, making the system very popular among
proof developers, but resulted in a more complex kernel that can now hardly be read
except by a few specialists. The slogan changed to a provable kernel, and indeed one
version of it was once proved with an earlier version (using strong normalization as
an assumption), and a new safe kernel extracted from that proof.

Of course, there have been many changes in the kernel since then, and its
correctness proof was not maintained. This is a first weakness with the readable kernel
paradigm: it does not resist changes. There is a second which relates directly to
CCIC: there is no guarantee that a decision procedure taken from the shelf
implements correctly the complex mathematical theorem on which it is based, since
carrying out such a proof may require an entire PhD work. Therefore, these procedures
cannot be part of the kernel.

Our solution to these problems is a new shift of paradigm to that of an
incremental kernel. The calculus on which a proof assistant is based should come in two
parts: a stable calculus implementing deduction, CIC in our case, which should
satisfy the readable or provable kernel paradigm; a collection of independent decision
procedures implementing computations, that produce checkable proof certificates.
The certificate checker should of course itself satisfy the readable or provable code
paradigm. Note that a Coq proof is a particular case of a checkable certificate.
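The separation between an untrusted decision procedure and a small trusted certificate checker can be illustrated by a toy sketch (Python; the goal format, the function names and the choice of linear Diophantine equations are assumptions made for illustration, not the mechanism of CCIC). The solver may be arbitrarily complex, while the checker only multiplies and adds:

```python
def ext_gcd(a, b):
    """Extended Euclid on non-negative integers: (g, x, y) with a*x + b*y == g."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y


def untrusted_solver(a, b, c):
    """Untrusted procedure: look for integers x, y such that a*x + b*y == c.

    Returns a certificate (x, y), or None when it finds no solution.
    """
    g, x, y = ext_gcd(a, b)
    if g == 0 or c % g != 0:
        return None
    k = c // g
    return x * k, y * k


def check_certificate(goal, certificate):
    """Trusted part: small enough to be read, or proved, once and for all."""
    a, b, c = goal
    x, y = certificate
    return a * x + b * y == c


def accept(goal, solver):
    """A goal is accepted only if the solver's certificate passes the checker."""
    certificate = solver(*goal)
    return certificate is not None and check_certificate(goal, certificate)
```

A faulty solver such as `lambda a, b, c: (1, 1)` cannot make `accept` return `True` on a false goal: soundness rests on `check_certificate` alone, which is the point of the incremental kernel paradigm.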

This paradigm has many advantages. First, it allows for a modular, cooperative 
development of the system, by separating the development of the kernel from that 
of the decision procedures. Second, it allows for an unsafe mode in case a decision 
procedure is used that does not have a certificate generator yet. Third, it makes it
easier to trace errors in case the system rejects a proof, by using decision procedures
that output explanations when they fail. Last, it allows the user to use any decision
procedure she needs by simply hooking it to the system, possibly in unsafe mode. 

This incremental schema is quite flexible, assuming that decision procedures 
come one by one. However, even so, they are not independent, they must be com- 
bined. Combining first-order decision procedures is not a new problem, it was con- 
sidered in the early 80's by Nelson and Oppen on the one hand, by Shostak on the 
other hand, and has generated much work since then. There are several possibilities 
to build in this mechanism: in the kernel, via a certificate generator and checker 
again, or by reflection. This design decision has not been made yet. 

2 Congruent Inductive Constructions 

The Calculus of Congruent Inductive Constructions (CCIC) is an extension of CIC
which embeds in its conversion rule the validity entailment of a fixed first-order
theory. First, we recall the basics of CIC before introducing parametric multi-sorted
algebras and then embedding these first-order algebras into CIC. We are then able to
define our calculus relative to a specific congruence that is defined last. For
simplicity, we will only consider here the particular case of parametric lists and that of the
natural numbers equipped with Presburger arithmetic. This simple case allows us to
build lists of natural numbers, as well as lists of lists of natural numbers, and so on.
It indeed has the complexity of the whole calculus, which is not at all the case when
natural numbers only are considered as in [6].

2.1 Calculus of Inductive Constructions 

Terms. We start our presentation by first describing the terms of CIC.

CIC uses the sorts $\star$ (or Prop, or object level universe), $\Box$ (or Type, or predicate
level universe) and $\triangle$. We denote by $\mathcal{S} = \{\star, \Box, \triangle\}$ the set of CIC sorts.

Following the presentation of Pure Type Systems (PTS) [14], we use two classes
of variables: $\mathcal{X}^\star$ and $\mathcal{X}^\Box$ are countably infinite sets of term variables and predicate
variables such that $\mathcal{X}^\star$ and $\mathcal{X}^\Box$ are disjoint. We write $\mathcal{X}$ for $\mathcal{X}^\star \cup \mathcal{X}^\Box$.

We shall use $\vec{u}$ for a list $(u_1, \ldots, u_n)$, $s$ for a sort in $\mathcal{S}$, $x, y, \ldots$ for variables in
$\mathcal{X}^\star$, and $X, Y, \ldots$ for variables in $\mathcal{X}^\Box$.

Definition 1 (Pseudo-terms). The algebra $\mathcal{L}$ of pseudo-terms of CIC is defined by:

$$t, u, T, U, \ldots ::= s \in \mathcal{S} \mid x \in \mathcal{X} \mid \forall(x : T).\,t \mid \lambda[x : T].\,t \mid t\,u \mid \mathrm{Ind}(X : t)\{\vec{T_i}\} \mid t^{[n]} \mid \mathrm{Elim}(t : T\,[\vec{u_i}] \to U)\{\vec{w_j}\}$$

The notion of free variables is as usual, the binders being $\lambda$, $\forall$ and Ind (in
$\mathrm{Ind}(X : t)\{\vec{T_i}\}$, $X$ is bound in the $T_i$'s). We write $FV(t)$ for the set of free variables
of $t$. We say that $t$ is closed if $FV(t) = \emptyset$. A variable $x$ freely occurs in $t$ if $x \in FV(t)$.

Inductive types. The novelty of CIC was to introduce inductive types, denoted
by $I = \mathrm{Ind}(X : T)\{\vec{C_i}\}$ where the $C_i$'s describe the types of the constructors of $I$, and
$T$ the type (or arity) of $I$ which must be of the form $\forall(\vec{x_i : T_i}).\star$. The $k$-th constructor
of the inductive type $I$, of type $C_k\{X \mapsto I\}$, will be denoted by $I^{[k]}$.

As an easy first example, we define natural numbers: $nat := \mathrm{Ind}(X : \star)\{X, X \to X\}$.
We shall use 0 and S as constructors for natural numbers, of respective types $nat$
and $nat \to nat$, obtained by replacing $X$ by $nat$ in the above two expressions $X$ and
$X \to X$. Elimination rules for nat are as follows:

$$\mathrm{Elim}_N(0, Q)\{v_0, v_S\} \to v_0$$

$$\mathrm{Elim}_N(S\,x, Q)\{v_0, v_S\} \to v_S\ x\ (\mathrm{Elim}_N(x, Q)\{v_0, v_S\}) \quad \text{with } Q : nat \to s,\ s \in \mathcal{S}.$$
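Read as a program, the two elimination rules above are primitive recursion. The following sketch (Python; natural numbers are represented by non-negative Python integers and the predicate Q is dropped since Python is untyped, both being assumptions made for illustration) mirrors them and defines addition and multiplication through the recursor:

```python
def elim_nat(n, v0, vS):
    """Mirror of Elim_N(n, Q){v0, vS}.

    Elim_N(0)   -> v0
    Elim_N(S x) -> vS x (Elim_N(x))
    """
    if n == 0:
        return v0
    return vS(n - 1)(elim_nat(n - 1, v0, vS))


def add(m, n):
    # Recursion on the first argument: 0 + n = n, (S x) + n = S (x + n).
    return elim_nat(m, n, lambda _x: lambda rec: rec + 1)


def mul(m, n):
    # 0 * n = 0, (S x) * n = n + (x * n).
    return elim_nat(m, 0, lambda _x: lambda rec: add(n, rec))
```

Here `vS` receives the predecessor and the result of the recursive call, in curried form, exactly as in the second rule.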

Similarly, we now define parametric lists: $list := \lambda[T : \star].\mathrm{Ind}(X : \star)\{X, T \to X \to X\}$. We
shall use nil and cons as constructors for parametrized lists, of respective types
$\forall(T : \star).list(T)$ and $\forall(T : \star).T \to list(T) \to list(T)$. Elimination rules for list are:

$$\mathrm{Elim}_L(nil, Q)\{v_{nil}, v_{cons}\} \to v_{nil}$$

$$\mathrm{Elim}_L(cons\ x\ l, Q)\{v_{nil}, v_{cons}\} \to v_{cons}\ x\ l\ (\mathrm{Elim}_L(l, Q)\{v_{nil}, v_{cons}\})$$

Finally, we define dependent words over an alphabet $A$:

$$word := \mathrm{Ind}(X : nat \to \star)\{X\,0,\ A \to X\,(S\,0),\ \forall(y, z : nat).\,X\,y \to X\,z \to X\,(y + z)\}$$

We shall use $\epsilon$, char and app for its three constructors, of respective types $word\ 0$,
$A \to word\,(S\,0)$, and $\forall(n, m : nat).\,word\ n \to word\ m \to word\,(n + m)$, obtained as
previously by replacing $X$ by $word$ in the three expressions $X\,0$, $A \to X\,(S\,0)$, and
$\forall(y, z : nat).\,X\,y \to X\,z \to X\,(y + z)$. Elimination rules for dependent words are:

$$\mathrm{Elim}_W(\epsilon, Q)\{v_\epsilon, v_{char}, v_{app}\} \to v_\epsilon$$

$$\mathrm{Elim}_W(char\ x, Q)\{v_\epsilon, v_{char}, v_{app}\} \to v_{char}\ x$$

$$\mathrm{Elim}_W(app\ n\ m\ l\ l', Q)\{v_\epsilon, v_{char}, v_{app}\} \to v_{app}\ n\ m\ l\ l'\ (\mathrm{Elim}_W(l, Q)\{v_\epsilon, v_{char}, v_{app}\})\ (\mathrm{Elim}_W(l', Q)\{v_\epsilon, v_{char}, v_{app}\})$$

Definitions by induction. We can now define functions by induction over natural
numbers, lists or words. Since using the CIC syntax is a bit painful, we give only a
quite simple example defining append (written @) for lists of natural numbers, of
type $\forall(T : \star).\,list(T) \to list(T) \to list(T)$:

fA[x:nat][/":listnat]. \ 
I', A [/I :listnat][/2:listnat]. \ 
A[L: 2/l/2].consxL j 
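As an informal counterpart, the list eliminator and an append function defined through it can be sketched in Python (the tuple encoding of lists and the names `NIL`, `cons`, `elim_list` and `append` are assumptions made for illustration; type parameters and the predicate Q are dropped since Python is untyped):

```python
NIL = ("nil",)


def cons(x, l):
    return ("cons", x, l)


def elim_list(l, v_nil, v_cons):
    """Mirror of Elim_L(l, Q){v_nil, v_cons}.

    Elim_L(nil)      -> v_nil
    Elim_L(cons x l) -> v_cons x l (Elim_L(l))
    """
    if l == NIL:
        return v_nil
    _, head, tail = l
    return v_cons(head)(tail)(elim_list(tail, v_nil, v_cons))


def append(l1, l2):
    # By induction on l1: nil @ l2 = l2, (cons x l) @ l2 = cons x (l @ l2).
    return elim_list(l1, l2, lambda x: lambda _tail: lambda rec: cons(x, rec))


one_two = cons(1, cons(2, NIL))
result = append(one_two, cons(3, NIL))
```

The branch for cons receives the head, the tail and the result of the recursive call on the tail, and rebuilds the list with the head in front of that result.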

Strong and Weak reductions. CIC distinguishes strong $\iota$-elimination, when
the type $Q$ of terms constructed by induction is at predicate level, from weak
$\iota$-elimination, when $Q$ is at object level. Strong elimination is restricted to small
inductive types to ensure logical consistency [24].

Typing judgments. A typing environment $\Gamma$ is a sequence of pairs $x_i : T_i$ made
of a variable $x_i$ and a term $T_i$ (we say that $\Gamma$ binds $x_i$ to the type $T_i$), such that $\Gamma$ does
not bind a variable twice. The typing judgments are classically written $\Gamma \vdash t : T$,
meaning that the well formed term $t$ is a proof of the proposition $T$ (has type $T$)
under the well formed environment $\Gamma$. $x\Gamma$ will denote the type associated to $x$ in $\Gamma$,
and we write $\mathrm{dom}(\Gamma)$ for the domain of $\Gamma$.

The typing rules of CIC given in Figure 1 are made of the typing rules for CC and the
typing rules for inductive types, given for the particular case of nat and list.



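$$\frac{}{\vdash \star : \Box}\ [\text{Ax-1}] \qquad \frac{}{\vdash \Box : \triangle}\ [\text{Ax-2}]$$

$$\frac{\Gamma \vdash T : s_T \quad \Gamma, [x : T] \vdash U : s_U}{\Gamma \vdash \forall(x : T).\,U : s_U}\ [\text{Prod}]$$

$$\frac{\Gamma \vdash \forall(x : T).\,U : s \quad \Gamma, [x : T] \vdash u : U}{\Gamma \vdash \lambda[x : T].\,u : \forall(x : T).\,U}\ [\text{Abs}]$$

$$\frac{\Gamma \vdash t : \forall(x : U).\,V \quad \Gamma \vdash u : U}{\Gamma \vdash t\,u : V\{x \mapsto u\}}\ [\text{App}]$$

$$\frac{\Gamma \vdash V : s \quad \Gamma \vdash t : T \quad s \in \{\star, \Box\} \quad x \in \mathcal{X}^{s} - \mathrm{dom}(\Gamma)}{\Gamma, [x : V] \vdash t : T}\ [\text{Weak}]$$

$$\frac{x \in \mathrm{dom}(\Gamma) \cap \mathcal{X}^{s_x} \quad \Gamma \vdash x\Gamma : s_x}{\Gamma \vdash x : x\Gamma}\ [\text{Var}]$$

$$\frac{\Gamma \vdash t : T \quad \Gamma \vdash T' : s' \quad T \leftrightarrow^{*}_{\beta\iota} T'}{\Gamma \vdash t : T'}\ [\text{Conv}]$$

$$\frac{\Gamma \vdash Q : nat \to s \quad s \in \{\star, \Box\} \quad \Gamma \vdash n : nat \quad \Gamma \vdash v_0 : Q\,0 \quad \Gamma \vdash v_S : \forall(p : nat).\,Q\,p \to Q\,(S\,p)}{\Gamma \vdash \mathrm{Elim}_N(n, Q)\{v_0, v_S\} : Q\,n}\ [\text{Elim}]$$

$$\frac{\begin{array}{c}\Gamma \vdash T : \star \quad \Gamma \vdash p : nat \quad \Gamma \vdash l : list\ T\ p \quad \Gamma \vdash Q : \forall(n : nat).\,list\ T\ n \to s \quad s \in \{\star, \Box\} \\ \Gamma \vdash v_{nil} : Q\,0\,(nil\ T) \quad \Gamma \vdash v_{cons} : \forall(x : T)(n : nat)(l : list\ T\ n).\,Q\,n\,l \to Q\,(S\,n)\,(cons\ T\ x\ n\ l)\end{array}}{\Gamma \vdash \mathrm{Elim}_L(l, Q)\{v_{nil}, v_{cons}\} : Q\,p\,l}\ [\text{Elim}]$$

Fig. 1 CIC typing rules for nat and list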



We did not give the general typing elimination rule for arbitrary inductive types,
which is quite complicated. Instead, we gave the elimination rules obtained for our
three inductive types nat, list and word. We refer to [15, 24] for the general case,
and for the precise typing rule of $\mathrm{Elim}_W$.



2.2 Parametric sorted algebras 



Parametric sorted signature. Order-sorted algebras were introduced as a formal
framework for the OBJ language in [13], before being generalized as membership
equational logic in [8]. We use here a polymorphic version of a restriction of the
latter, by assuming given a signature $(\Lambda, \Sigma)$, $\Lambda$ for the sort constructors, and $\Sigma$ for
the function symbols made of a set of constructors for each sort constructor, and of
a set of defined symbols. We shall use the notation $f : \forall\vec{\alpha}.\ \sigma_1 \times \cdots \times \sigma_n \to \tau$ for
symbol declarations. As an example, we describe natural numbers and parametric
(non-dependent) lists using an OBJ-like syntax. We rule out here partiality, as
introduced in practice by destructor symbols, for the sake of clarity.

We shall use $\mathcal{V} = \{\alpha, \beta, \ldots\}$ for the set of sort variables, and $\mathcal{T}_\Lambda(\mathcal{V}) =
\{\sigma, \tau, \ldots\}$ for the set of sort expressions.

    sort nat
    sort list
    svar α
    cons 0    : nat
    cons S    : nat → nat
    fun  +    : nat × nat → nat
    cons nil  : list(α)
    cons cons : α × list(α) → list(α)
    fun  @    : list(α) × list(α) → list(α)

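Definition 2 (Terms). For any sort $\sigma$, let $\mathcal{X}^\sigma$ be a countably infinite set of variables
of sort $\sigma$, s.t. all the $\mathcal{X}^\sigma$'s are pairwise disjoint. Let $\mathcal{X} = \bigcup_\sigma \mathcal{X}^\sigma$. For any $x \in \mathcal{X}$,
we say that $x$ has sort $\sigma$ if $x \in \mathcal{X}^\sigma$. For any sort $\sigma$, the set $\mathcal{T}_\sigma(\Sigma, \mathcal{X})$ of terms of
sort $\sigma$ with variables $\mathcal{X}$ is the smallest set s.t.:

1. if $x \in \mathcal{X}^\sigma$ then $x \in \mathcal{T}_\sigma(\Sigma, \mathcal{X})$,

2. if $(t_1, \ldots, t_n) \in \mathcal{T}_{\sigma_1\xi}(\Sigma, \mathcal{X}) \times \cdots \times \mathcal{T}_{\sigma_n\xi}(\Sigma, \mathcal{X})$ where $f : \forall\vec{\alpha}.\ \sigma_1 \times \cdots \times \sigma_n \to \tau$
and $\xi$ is a sort substitution, then $f(t_1, \ldots, t_n) \in \mathcal{T}_{\tau\xi}(\Sigma, \mathcal{X})$.

Let $\mathcal{T}(\Sigma, \mathcal{X}) = \bigcup_\sigma \mathcal{T}_\sigma(\Sigma, \mathcal{X})$. A term $t$ has sort $\sigma$ if $t \in \mathcal{T}_\sigma(\Sigma, \mathcal{X})$.

Note that the sets $\mathcal{X}^\sigma$ play the role of a typing context.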

Example 1. Assuming that $x$ is a variable of sort nat, then 0 and 0 + x are of sort
nat, while nil is of sort list($\alpha$), list(nat), list(list(nat)), etc.
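Definition 2 can be read as a sort-checking procedure. The sketch below (Python; the encoding of sorts and terms, the declared sort variable name `"a"` and all function names are assumptions made for illustration) matches the declared result sort of a symbol against the expected sort to obtain the sort substitution, then checks the arguments at the instantiated argument sorts. It relies on every sort variable of a declaration occurring in its result sort, which holds for the signature used here:

```python
SORT_VARS = {"a"}  # sort variables used in declarations

# symbol -> (argument sorts, result sort); sorts are "nat", "a" or ("list", s)
SIGNATURE = {
    "0": ((), "nat"),
    "S": (("nat",), "nat"),
    "+": (("nat", "nat"), "nat"),
    "nil": ((), ("list", "a")),
    "cons": (("a", ("list", "a")), ("list", "a")),
    "@": ((("list", "a"), ("list", "a")), ("list", "a")),
}


def match(pattern, sort, subst):
    """Extend subst so that pattern instantiated by subst equals sort, or None."""
    if pattern in SORT_VARS:
        if pattern in subst and subst[pattern] != sort:
            return None
        return {**subst, pattern: sort}
    if isinstance(pattern, str) or isinstance(sort, str):
        return subst if pattern == sort else None
    if pattern[0] != sort[0] or len(pattern) != len(sort):
        return None
    for p, s in zip(pattern[1:], sort[1:]):
        subst = match(p, s, subst)
        if subst is None:
            return None
    return subst


def instantiate(pattern, subst):
    if isinstance(pattern, str):
        return subst.get(pattern, pattern)
    return (pattern[0],) + tuple(instantiate(p, subst) for p in pattern[1:])


def has_sort(term, sort, var_sorts):
    """Terms are variable names, constant names, or tuples (symbol, arg1, ...)."""
    if isinstance(term, str) and term in var_sorts:
        return var_sorts[term] == sort
    head, args = (term, ()) if isinstance(term, str) else (term[0], term[1:])
    arg_sorts, result = SIGNATURE[head]
    subst = match(result, sort, {})
    if subst is None or len(args) != len(arg_sorts):
        return False
    return all(has_sort(a, instantiate(s, subst), var_sorts)
               for a, s in zip(args, arg_sorts))
```

With `var_sorts = {"x": "nat"}`, the term `("+", "0", "x")` has sort `"nat"`, and `"nil"` has sort `("list", "nat")` as well as `("list", ("list", "nat"))`, as in Example 1.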

Definition 3 (Equations). Equations $t =^\sigma u$ are pairs of terms of the same sort $\sigma$.

Example 2. Assuming x of sort nat and / of sort list(list((nat)), x + x is an 
equation of sort nat and $cons(x, nil) =^{list(nat)} car(l)$ is an equation of sort list(nat).

We can therefore as usual build parametrized algebras for list, algebras for nat 
and therefore get algebras for nat, list(nat), etc. Satisfaction of an equation in these 
algebras is defined as usual. In practice, type superscripts may be omitted when they
can be inferred from the context.

2.3 Embedding parametric algebras in CIC 

Our purpose here is to embed parametric multi-sorted algebra into CIC. As a result,
two different, but related kinds of symbols will coexist, in CIC and in the embedded
algebraic sub-world. We shall distinguish them by underlining symbols in CIC.

The first step of the translation maps sort constructors and constructor
symbols to CIC inductive types and constructors, respectively. We start with natural numbers
and its sort constructor nat. Constructor symbols of nat are simply all the
constructor symbols whose codomain is nat, i.e. here 0 and S. We thus define $\underline{nat}$ (the CIC
inductive type attached to nat) as an inductive type with two constructor types (one
for 0, and one for S): $\underline{nat} := \mathrm{Ind}(X : \star)\{C_1(X), C_2(X)\}$.



The constructor types of $\underline{nat}$ are simply the arities of 0 and S where nat is
replaced with the constructor type variable: $C_1(X) = X$ and $C_2(X) = X \to X$. As
expected, we obtain here the standard inductive definition of natural numbers given
in Section 2.1: $\mathrm{Ind}(X : \star)\{X, X \to X\}$. The translation of 0 (resp. of S) is then
simply $\underline{nat}^{[1]}$ (resp. $\underline{nat}^{[2]}$).

Translating list is not very different. Being of arity 1, with two associated
constructor symbols (nil and cons), list is mapped to the already seen parametrized
inductive type $\underline{list} := \lambda[A : \star].\mathrm{Ind}(X : \star)\{X, A \to X \to X\}$. Translation of constructors
is done the same way. We just need to take care of the currying of symbols, and to
replace sort variables with CIC type variables.

Finally, defined symbols are mapped to CIC defined symbols, after translating 
their type appropriately. 

2.4 Building in a first-order theory 

We now start describing our new calculus CCIC. 

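Terms. CCIC uses the same set of sorts $\mathcal{S} = \{\star, \Box, \triangle\}$ and sets of variables
$\mathcal{X} = \mathcal{X}^\star \cup \mathcal{X}^\Box$ as CIC. For any sort $\sigma \in \Lambda$, let $\mathcal{X}^\sigma \subset \mathcal{X}^\star$ be an infinite set of variables
of sort $\sigma$ s.t. $\{\mathcal{X}^\sigma\}_\sigma$ is a family of pairwise disjoint sets. We also assume that
$\mathcal{X}^\star - \bigcup_\sigma \mathcal{X}^\sigma$ is infinite.

Let $\mathcal{A} = \{r, u\}$ be a set of two constants, called annotations, totally ordered by
$u \prec r$, where $r$ stands for restricted and $u$ for unrestricted. We use $a$ for an arbitrary
annotation. The role of annotations will be explained later.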

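Definition 4 (Pseudo-terms of CCIC). Given a parametric sorted signature $(\Lambda, \Sigma)$,
the algebra $\mathcal{L}$ of pseudo-terms of CCIC is defined as:

$$t, u, T, U, \ldots ::= s \in \mathcal{S} \mid x \in \mathcal{X} \mid \forall(x :^a T).\,t \mid \lambda[x :^a T].\,t \mid t\,u \mid f \in \Sigma \mid \sigma \in \Lambda \mid t =_T u \mid \mathrm{Eq}_T(t) \mid \mathrm{Ind}(X : t)\{\vec{T_i}\} \mid t^{[n]} \mid \mathrm{Elim}(t : T\,[\vec{u_i}] \to U)\{\vec{w_j}\}$$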

In order to make definitions more convenient, we assume in the following that $\Lambda$
contains the symbols =, nat and list, and that $\Sigma$ contains the symbols 0, S and Eq.
Compared with CIC, the differences are:

• the internalization of the first-order symbols,

• the internalization of the equality predicate:

- $t =_T u$ denotes the equality of the two terms (of type $T$) $t$ and $u$,

- $\mathrm{Eq}_T(t)$ represents the reflexivity proof of $t =_T t$.

• annotations in products and abstractions are used to control the formation of
applications, as can be seen from the new [App] rule given in Figure 2.

Notation 2.1 When $x$ is not free in $t$, $\forall(x :^a T).\,t$ is written $T \to^a t$. The default
annotation, when not specified in a product or abstraction, is the unrestricted one.

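As usual, there is a layered set of syntactic classes for pseudo-terms.

Definition 5 (Syntactic classes). The pairwise disjoint syntactic classes of CCIC
called objects ($\mathcal{O}$), predicates ($\mathcal{P}$), kinds ($\mathcal{K}$), kinds predicates ($\Box$), and $\triangle$ are
defined as usual:

- $\mathcal{O} ::= \mathcal{X}^\star \mid f \in \Sigma \mid \mathcal{O}\,\mathcal{O} \mid \mathcal{O}\,\mathcal{P} \mid \lambda[x^\star :^a \mathcal{P}].\mathcal{O} \mid \lambda[x^\Box :^a \mathcal{K}].\mathcal{O} \mid \mathrm{Elim}(\mathcal{O} : \mathcal{P}\,[\vec{\mathcal{O}}] \to \mathcal{P})\{\vec{\mathcal{O}}\}$

- $\mathcal{P} ::= \mathcal{X}^\Box \mid \sigma \in \Lambda \mid \mathcal{P}\,\mathcal{O} \mid \mathcal{P}\,\mathcal{P} \mid \lambda[x^\star :^a \mathcal{P}].\mathcal{P} \mid \lambda[x^\Box :^a \mathcal{K}].\mathcal{P} \mid \mathrm{Elim}(\mathcal{O} : \mathcal{P} \to \mathcal{K})\{\vec{\mathcal{P}}\} \mid \forall(x^\star :^a \mathcal{P}).\mathcal{P} \mid \forall(x^\Box :^a \mathcal{K}).\mathcal{P}$

- $\mathcal{K} ::= \star \mid \mathcal{K}\,\mathcal{O} \mid \mathcal{K}\,\mathcal{P} \mid \lambda[x^\star :^a \mathcal{P}].\mathcal{K} \mid \lambda[x^\Box :^a \mathcal{K}].\mathcal{K} \mid \forall(x^\star :^a \mathcal{P}).\mathcal{K} \mid \forall(x^\Box :^a \mathcal{K}).\mathcal{K}$

- $\Box ::= \Box \mid \forall(x^\star :^a \mathcal{P}).\Box \mid \forall(x^\Box :^a \mathcal{K}).\Box$

- $\triangle ::= \triangle$

This enumeration defines a successor function +1 on classes ($\mathcal{O} + 1 = \mathcal{P}$, $\mathcal{P} + 1 = \mathcal{K}$,
$\mathcal{K} + 1 = \Box$, $\Box + 1 = \triangle$). We also define $\mathrm{Class}(t) = \mathcal{D}$ if $t \in \mathcal{D}$ and $\mathcal{D} \in \{\mathcal{O}, \mathcal{P}, \mathcal{K}, \Box, \triangle\}$,
and $\mathrm{Class}(t) = \bot$ otherwise.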

From now on, we only consider well-constructed terms (i.e. terms whose class is
not $\bot$) and well-constructed substitutions (i.e. substitutions $\theta$ s.t. $\mathrm{Class}(x) = \mathrm{Class}(x\theta)$
for any $x$ in its domain). It is easy to check that if $t$ is a well-constructed term and
$\theta$ a well-constructed substitution, then $\mathrm{Class}(t) = \mathrm{Class}(t\theta)$. It is also well-known
that $\beta$-reduction preserves term classes.

Definition 6 (Pseudo-contexts of CCIC). The typing environments of CCIC are
defined as $\Gamma, \Delta ::= [\,] \mid \Gamma, [x :^a T]$ s.t. a variable cannot be declared twice. We use
$\mathrm{dom}(\Gamma)$ for the domain of $\Gamma$ and $x\Gamma$ for the type associated to $x$ in $\Gamma$.

The rules defining the CCIC typing judgment $\Gamma \vdash t : T$ are the same as for CIC
except the rules for application and conversion given in Figure 2.

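$$\frac{\begin{array}{c}\Gamma \vdash t : \forall(x :^a U).\,V \quad \Gamma \vdash u : U \\ \text{if } a = r \text{ and } U \to^{*}_{\beta} t_1 =_T t_2 \text{ with } t_1, t_2 \in \mathcal{O}, \text{ then } t_1 \sim_\Gamma t_2 \text{ must hold}\end{array}}{\Gamma \vdash t\,u : V\{x \mapsto u\}}\ [\text{App}]$$

Fig. 2 CCIC modified typing rules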

2.5 Conversion 

We are now left with defining the conversion relation $\sim_\Gamma$, whose definition needs
some preparation, since:

• conversion is defined on CCIC terms, but the first-order decision procedures
operate on algebraic terms. We therefore need to translate CCIC terms into
algebraic terms, a process we call algebraisation.

• conversion will operate on weak terms only, a notion introduced later in this section.
Non-weak terms will be converted with $\beta\iota$-reduction only, to forbid lifting up
inconsistencies from the object level to the type level. This is crucial to avoid
breaking strong normalization, and therefore decidability of type-checking, in the presence
of inconsistent user assumptions.



$$\frac{\Gamma \vdash t : T \quad \Gamma \vdash T' : s' \quad T \sim_\Gamma T'}{\Gamma \vdash t : T'}\ [\text{Conv}]$$
Algebraisation. Our calculus has a complex notion of computation reflecting
its rich structure made of three ingredients: the typed lambda calculus, the
inductive types with their recursors and the integration of the first-order theory $\mathcal{T}$ in its
conversion. To achieve this integration, goals are sent to the first-order theory $\mathcal{T}$
together with a set of proof hypotheses extracted from the current context.

Algebraisation is the first step of this extraction: it allows transforming a CCIC
term into its first-order counterpart. We illustrate this with an example, $\mathcal{T}$ being
Presburger arithmetic.

We begin with the simplest case, directly taken from $CC_N$: the extraction of pure
algebraic, non-parametric equations. Suppose that the proof environment contains
equations of the form $c = 1 + d$ and $d = 2$ with $c$ and $d$ variables of sort nat. What
is expected is that the set of hypotheses sent to the theory $\mathcal{T}$ contains the two well
formed $\mathcal{T}$-formulas $c = 1 + d$ and $d = 2$. This leads to a first definition of equation
extraction:

1. a term is algebraic if it is of the form 0, or $S\,t$, or $t + u$, or $x \in \mathcal{X}^{nat}$. The
algebraisation $\mathcal{A}(t)$ of an algebraic term is then defined by induction: $\mathcal{A}(0) = 0$,
$\mathcal{A}(S\,t) = S(\mathcal{A}(t))$, $\mathcal{A}(t + u) = \mathcal{A}(t) + \mathcal{A}(u)$ and $\mathcal{A}(x) = x$,

2. a term is an extractable equation if it is of the form $t = u$ with $t$ and $u$ algebraic
terms. The extracted equation is then $\mathcal{A}(t) = \mathcal{A}(u)$.
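The two steps above can be sketched as follows (Python; the tuple encoding of terms, the representation of hypotheses as pairs, and the names `algebraise` and `extract_equations` are assumptions made for illustration). A hypothesis is kept only when both of its sides are algebraic:

```python
def algebraise(t, nat_vars):
    """Return the first-order counterpart of t, or None if t is not algebraic.

    Terms are "0", a variable name, ("S", t), ("+", t, u), or any other tuple
    standing for a non-algebraic term.
    """
    if isinstance(t, str):
        return t if t == "0" or t in nat_vars else None
    if t[0] == "S" and len(t) == 2:
        arg = algebraise(t[1], nat_vars)
        return None if arg is None else ("S", arg)
    if t[0] == "+" and len(t) == 3:
        left = algebraise(t[1], nat_vars)
        right = algebraise(t[2], nat_vars)
        return None if left is None or right is None else ("+", left, right)
    return None


def extract_equations(hypotheses, nat_vars):
    """Keep the hypotheses (lhs, rhs) whose two sides are both algebraic."""
    extracted = []
    for lhs, rhs in hypotheses:
        a, b = algebraise(lhs, nat_vars), algebraise(rhs, nat_vars)
        if a is not None and b is not None:
            extracted.append((a, b))
    return extracted


ONE = ("S", "0")
TWO = ("S", ONE)
# c = 1 + d, d = 2, and a hypothesis c = f d whose right-hand side is not algebraic.
context = [("c", ("+", ONE, "d")), ("d", TWO), ("c", ("f", "d"))]
equations = extract_equations(context, nat_vars={"c", "d"})
```

In this first definition the hypothesis `c = f d` is simply dropped; the treatment of non-algebraic subterms by abstraction variables comes later in the section.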

The definition becomes harder for parametric signatures. The theory of lists gives
us a paradigmatic example. From the definition of embedding a polymorphic
multi-sorted algebra into CIC, we know that the symbol @ has $\forall(T : \star).\,list\ T \to list\ T \to list\ T$
for type. Thus, a fully applied, well formed term having the symbol @ at head
position must be of the form $(@\ T\ l_1\ l_2)$, $T$ being the type of the elements of the
lists $l_1$ and $l_2$. Algebraisation of such a term will erase all type parameters: in our
example, $\mathcal{A}(@\ T\ l_1\ l_2) = @(\mathcal{A}(l_1), \mathcal{A}(l_2))$.

Algebraisation of non-pure algebraic terms is done by abstracting non-algebraic
subterms with fresh variables. For example, algebraisation of $1 + t$ with $t$
non-algebraic will lead to $1 + x_{nat}$ where $x_{nat}$ is an abstraction variable of sort nat for
$t$. Of course, if the proof context contains two equations of the form $c = 1 + t$ and
$d = 1 + u$ with $t$ and $u$ $\beta\iota$-convertible, $t$ and $u$ should be abstracted by a unique
variable so that $c = d$ can be deduced in $\mathcal{T}$ from $c = 1 + y_{nat}$ and $d = 1 + y_{nat}$. The
problem is harder for:

• parametric symbols: in $(cons\ T\ t\ (nil\ U))$ with $t$ non-algebraic, should $t$ be
abstracted by a variable of sort nat or list(nat)?

• ill-formed terms: should $(cons\ T\ (cons\ T\ (nil\ U)\ (nil\ T)))$ be abstracted as a list
of natural numbers or as a list of lists?

Our solution is to postpone decisions: $\mathcal{A}(t)$ will be a function from $\Lambda$ to the terms
of $\mathcal{T}$ s.t. $\mathcal{A}(t)(\sigma)$ is the algebraisation of $t$ under the condition that $t$ is a CCIC
representation of a first-order term of sort $\sigma$.

We now give the formal definition of $\mathcal{A}(\cdot)$. We assume:

- a $\Lambda$-sorted family $\{\mathcal{Y}_\sigma\}_\sigma$ of pairwise disjoint countably infinite sets of variables
of sort $\sigma$. Let $\mathcal{Y} = \bigcup_\sigma \mathcal{Y}_\sigma$;



From Formal Proofs to Mathematical Proofs 




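- for any equivalence relation $R$ and sort $\sigma \in \Lambda$, we assume a function $\pi_R^\sigma :
CCIC \to \mathcal{Y}_\sigma$ s.t. $\pi_R^\sigma(t) = \pi_R^\sigma(u)$ if and only if $t\ R\ u$ (i.e. $\pi_R^\sigma(t)$ is the element
of $\mathcal{Y}_\sigma$ representing the class of $t$ modulo $R$).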

Definition 7 (Well applied term). A term is well applied if it is of the form
$f\ [T_\alpha]_{\alpha \in \vec{\alpha}}\ t_1 \cdots t_n$ with $f : \forall\vec{\alpha}.\ \sigma_1 \times \cdots \times \sigma_n \to \sigma$.

Example 3. Examples of well applied terms are 0, $S\,t$, or $cons\ T\ x\ l$, $T$ being the type
parameter here. Note that we do not require the term to be well formed.

In case of partial symbols, such as car for lists, this definition must be changed 
slightly by adding a new argument, the proof that the input satisfies the appropriate 
guard, here that it is not nil. 

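Definition 8 (Algebraisation). The algebraisation of $t \in CCIC$ modulo an
equivalence relation $R$ is the function $\mathcal{A}_R(t) : \Lambda \to \mathcal{T}(\Sigma, \mathcal{X}^\star \cup \mathcal{Y})$ defined by:

$$\mathcal{A}_R(f\ \vec{T}\ [u_i]_{i \in n})(\tau\xi) = f(\mathcal{A}_R(u_1)(\sigma_1\xi), \ldots, \mathcal{A}_R(u_n)(\sigma_n\xi))$$

$$\mathcal{A}_R(t)(\tau) = \pi_R^\tau(t) \quad \text{otherwise}$$

where $f : \forall\vec{\alpha}.\ \sigma_1 \times \cdots \times \sigma_n \to \tau$, $f\ \vec{T}\ [u_i]_{i \in n}$ is well applied, and $\xi$ is a $\Lambda$-substitution.
For any relation $R$, $\mathcal{A}_R$ is defined as $\mathcal{A}_{\tilde{R}}$ where $\tilde{R}$ is the smallest equivalence
relation containing $R$. We call $\sigma$-alien (or alien when the context is clear) a subterm
of $t$ abstracted by a variable in $\mathcal{Y}$ and say that $t$ is algebraic w.r.t. $\sigma$ if it contains
no $\sigma$-alien. We denote by $\mathcal{A}lg_\sigma$ the set of algebraic terms w.r.t. $\sigma$, and by $\mathcal{A}lg =
\bigcup_{\sigma \in \Lambda} \mathcal{A}lg_\sigma$ the set of algebraic terms.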

Example 4. Let $t = cons\ T\ 0\ (cons\ U\ (nil\ V)\ (nil\ U))$, $R$ be a relation on CCIC terms,
$\sigma = list(nat)$, and $x_{nat}$, $y_{list}$, $z_{nat}$, $x_\alpha$ and $y_\alpha$ be abstraction variables. Then:

$$\mathcal{A}_R(t)(\sigma) = cons(\mathcal{A}_R(0)(nat),\ \mathcal{A}_R(cons\ U\ (nil\ V)\ (nil\ U))(\sigma))$$
$$= cons(0,\ cons(\mathcal{A}_R(nil\ V)(nat),\ \mathcal{A}_R(nil\ U)(\sigma))) = cons(0,\ cons(x_{nat},\ nil))$$

$$\mathcal{A}_R(t)(list(\sigma)) = cons(\mathcal{A}_R(0)(\sigma),\ \mathcal{A}_R(cons\ U\ (nil\ V)\ (nil\ U))(list(\sigma)))$$
$$= cons(y_{list},\ cons(\mathcal{A}_R(nil\ V)(\sigma),\ \mathcal{A}_R(nil\ U)(list(\sigma)))) = cons(y_{list},\ cons(nil,\ nil))$$

$$\mathcal{A}_R(t)(list(\alpha)) = cons(x_\alpha,\ cons(y_\alpha,\ nil)) \quad \text{and} \quad \mathcal{A}_R(t)(nat) = z_{nat}.$$

It is clear from the above example that the algebraisation of a term depends on
the expected sort of the result: when abstracting the (heterogeneous and ill-formed)
list 0 :: nil :: nil as a list of lists, 0 is seen as an alien which must be abstracted.
When this list is abstracted as a list of natural numbers or as a polymorphic list, 0
is considered algebraic and the first occurrence of nil as an alien to be abstracted.
Finally, if the list is algebraised as a natural number, it is abstracted by a variable.
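The sort-directed behaviour of Definition 8 can be reproduced with a short sketch (Python; the tuple encoding of terms, the class name `Algebraiser`, the representation of abstraction variables as `("var", index, sort)`, and the use of syntactic identity for the equivalence relation R are all assumptions made for illustration). A term is decomposed when the declared result sort of its head symbol matches the expected sort; otherwise it is abstracted by a variable shared by identical aliens of the same sort:

```python
SORT_VARS = {"a"}  # sort variable used in declarations

# symbol -> (number of type parameters, argument sorts, result sort)
SIGNATURE = {
    "0": (0, (), "nat"),
    "S": (0, ("nat",), "nat"),
    "nil": (1, (), ("list", "a")),
    "cons": (1, ("a", ("list", "a")), ("list", "a")),
}


def match(pattern, sort, subst):
    """Extend subst so that pattern instantiated by subst equals sort, or None."""
    if pattern in SORT_VARS:
        if pattern in subst and subst[pattern] != sort:
            return None
        return {**subst, pattern: sort}
    if isinstance(pattern, str) or isinstance(sort, str):
        return subst if pattern == sort else None
    if pattern[0] != sort[0] or len(pattern) != len(sort):
        return None
    for p, s in zip(pattern[1:], sort[1:]):
        subst = match(p, s, subst)
        if subst is None:
            return None
    return subst


def instantiate(pattern, subst):
    if isinstance(pattern, str):
        return subst.get(pattern, pattern)
    return (pattern[0],) + tuple(instantiate(p, subst) for p in pattern[1:])


class Algebraiser:
    """Terms are tuples (symbol, type parameters..., arguments...)."""

    def __init__(self):
        self.aliens = {}  # (term, sort) -> abstraction variable

    def pi(self, term, sort):
        key = (term, sort)
        if key not in self.aliens:
            self.aliens[key] = ("var", len(self.aliens), sort)
        return self.aliens[key]

    def __call__(self, term, sort):
        head = term[0]
        if head in SIGNATURE:
            n_params, arg_sorts, result = SIGNATURE[head]
            args = term[1 + n_params:]  # type parameters are erased
            subst = match(result, sort, {})
            if subst is not None and len(args) == len(arg_sorts):
                return (head,) + tuple(
                    self(u, instantiate(s, subst))
                    for u, s in zip(args, arg_sorts))
        return self.pi(term, sort)  # alien: abstract by a variable


# The term of Example 4: cons T 0 (cons U (nil V) (nil U)).
t = ("cons", "T", ("0",), ("cons", "U", ("nil", "V"), ("nil", "U")))
as_list_nat = Algebraiser()(t, ("list", "nat"))
as_list_list = Algebraiser()(t, ("list", ("list", "nat")))
as_nat = Algebraiser()(t, "nat")
as_poly = Algebraiser()(t, ("list", "alpha"))
```

Under these assumptions the four results have the shapes computed in Example 4: `cons(0, cons(x, nil))`, `cons(y, cons(nil, nil))`, a single variable, and `cons(x, cons(y, nil))` with two distinct variables of sort `alpha`.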

Weak terms. We first distinguish a class of terms called weak. This class of terms
will play an important role in the following as they restrict the interaction between
the conversion at object level and the strong $\iota$-reduction.

An example of non weak term is

$$t := \lambda[x : nat].\ \mathrm{Elim}(x : nat\,[\,] \to Q)\{nat,\ \lambda[x : nat][T : Q\,x].\ nat \to nat\}$$

Such a term is problematic in the sense that when applied to convertible terms,
it can $\beta\iota$-reduce to type-level terms that are not $\beta\iota$-convertible. Suppose that the
conversion relation $\sim_\Gamma$ is canonically extended to CCIC. Assume a typing environment
$\Gamma$ s.t. $0 \sim_\Gamma S\,0$, and hence, by congruence, $t\ 0 \sim_\Gamma t\ (S\,0)$. Now, it is easy to check that
$t\ 0 \to^{*}_{\beta\iota} nat$ and $t\ (S\,0) \to^{*}_{\beta\iota} (nat \to nat)$. Strong normalization of $\beta$-reduction is then
broken by encoding the term $\omega = \lambda[x : nat].\,x\,x$.

In contrast, weak terms lift no inconsistencies from object level to a higher level: 

Definition 9 (Weak terms). A term is weak if it contains no i) applied type-level
variable, and ii) term of the form $\mathrm{Elim}(t : I\,[\vec{u}] \to Q)\{\vec{f}\}$ with $t$ open.

Extractable terms. From now on, let $\mathcal{O}^+$ be an arbitrary set of CCIC terms.
This set will be used in the conversion definition to restrict the set of extractable
equations of a given environment: only equations of the form $t = u$ with $t$ and $u$ in
$\mathcal{O}^+$ will be considered.

At the moment, we only require $\mathcal{O}^+$ to be a subset of $\mathcal{O}$. Note that taking $\mathcal{O}^+ =
\mathcal{O}$ does not compromise the standard calculus properties (subject reduction, type
unicity, strong normalization of $\beta\iota$-reduction, ...) but does compromise decidability. E.g., if $\mathcal{T}$ is
Presburger arithmetic, allowing the extraction of

$$\lambda[x :^a nat].\ f\,x = \lambda[x :^a nat].\ f\,(x + 2)$$

would require, for checking conversion, to decide any statement of the form

$$\mathcal{T} \vDash (\forall x.\ f(x) = f(x + 2)) \Rightarrow t = u,$$

which is well known to be impossible.

Conversion relation. We now have all the necessary ingredients to define our
conversion relation $\sim_\Gamma$.

Definition 10 (Conversion relation). Rules of Figure 3 define a family $\{\sim_\Gamma\}$ of
CCIC binary relations indexed by a (not necessarily well-formed) context $\Gamma$.

Note that the rule [DED], which performs deductions in the first-order theory, here
Presburger arithmetic, outputs a certificate [_, _, _] made of the environment and the
two terms to be proved equivalent under this environment, each time it is called.
While this certificate must depend on these three data, it may of course carry
additional information depending on the considered first-order theory.

The main differences with the calculus $CC_N$ defined in [6] are the following:

• The [App] rule has been split into two rules: [App$^s$] and [App$^w$]. Conversion
for strong terms is restricted to $\beta\iota$-conversion.

• Conversion for the first argument of an Elim is restricted to $\beta\iota$-conversion.

• The rules for transitivity and symmetry have been removed, which eases the
proofs, notably that the deduction part of the conversion relation works at object
level only. We prove later that the conversion relation is transitive and symmetric
on well-formed terms, thus recovering type unicity.

• The rules for $\beta\iota$-conversion perform one reduction step only, which also eases
proofs. Therefore $u \leftrightarrow^{*}_{\beta\iota} v$ should be understood as $\exists w$ s.t. $u \to^{*}_{\beta\iota} w$ and $v \to^{*}_{\beta\iota} w$.
[Figure 3: inference rules defining $\sim_\Gamma$: [REFL], [Eq], [Lam], [Prod], the congruence
rule for Elim (whose premises require weak terms), [$\beta\iota$-LEFT], [$\beta\iota$-RIGHT], the [App]
rules and [DED]. The [DED] rule concludes $t \sim_\Gamma u$ and outputs the certificate $[\Gamma, t, u]$.]

Fig. 3 CCIC conversion relation



2.6 Decidability of type-checking 

CCIC enjoys all needed meta-theoretical properties (strong normalization,
confluence, subject reduction), and therefore consistency follows:

Theorem 1. There is no proof of $\forall(x : \star).\,x$ in the empty environment.

All proofs are similar to those made for PTSs with the same succession of meta- 
theoretical lemmas, but need more preparation. This is in particular the case with 
the substitution lemma which is much harder than usual. 

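As said, type-checking in a dependent type theory is non-trivial, since the rule
[CONV] is not syntax-oriented. The classical solution to this problem is to eliminate
[CONV] and replace [APP] by the following rule. The proof is not difficult.

$$\frac{\begin{array}{c}\Gamma \vdash t : \forall(x :^a U).\,V \qquad \Gamma \vdash u : U' \qquad U \sim_\Gamma U' \\ \text{if } a = r \text{ and } U \to^{*}_{\beta\iota} t_1 = t_2 \text{ with } t_1, t_2 \in \mathcal{O}^+ \text{ then } t_1 \sim_\Gamma t_2 \text{ must hold}\end{array}}{\Gamma \vdash t\,u : V\{x \mapsto u\}}\ [\mathrm{App}]$$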

Decidability of type-checking in CCIC therefore reduces to decidability of $\sim_\Gamma$,
the environment $\Gamma$ being arbitrary, possibly containing ill-formed terms or even
being inconsistent. To show that $\sim_\Gamma$ is decidable, we proceed as previously, by
modifying the definition in order to make it syntax-oriented: we show that two
arbitrary terms are convertible iff their $\beta\iota$-normal forms are convertible by the
syntax-oriented weak convertibility relation $\approx_\Gamma$ given at Figure 4, in which, to any
environment $\Gamma$, we associate the set
$\mathrm{Eq}(\Gamma) = \{t = u \mid [x :^r T] \in \Gamma,\ T \to^{*} t = u,\ t, u \in \mathcal{O}^+\}$.
Lemma 1. Given $\Gamma$ an environment and $t$, $u$ two terms, $t \sim_\Gamma u$ iff
$t{\downarrow_{\beta\iota}} \approx_{\Gamma\downarrow_{\beta\iota}} u{\downarrow_{\beta\iota}}$.

This is the main technical result of the decidability proof, which proceeds by
induction on the definition of $\sim_\Gamma$. Note that the numerous conditions of the form
$\mathcal{T}, \mathrm{Eq}(\Gamma) \not\models 0 = 1$ in the rules defining $\approx_\Gamma$ are required to make them mutually
exclusive.

[Figure 4: inference rules defining the syntax-oriented relation $\approx_\Gamma$: the [REFL] rules,
[Unsat], [Lam], the rules for products and Elim, and the [App] rules. [Unsat] applies
when $\mathcal{T}, \mathrm{Eq}(\Gamma) \models 0 = 1$; other rules carry side conditions of the form
$\mathcal{T}, \mathrm{Eq}(\Gamma) \not\models 0 = 1$ or require that the compared terms are not in $\mathcal{O}^+$. The algebraic
[App] rule writes $t = C_t[a_1, \ldots, a_k]$ and $u = C_u[a_{k+1}, \ldots, a_{k+l}]$, where $C_t$ or $C_u$ is a
non-empty algebraic context, all the $a_i$'s have empty algebraic caps, and the $c_i$'s are
fresh constants s.t. $c_i = c_j$ iff $a_i \approx_\Gamma a_j$.]

Fig. 4 CCIC syntax-oriented conversion



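Example 5. Let $\Gamma = [c : \mathrm{nat}], [p :^r (\lambda[x : \mathrm{nat}].\,x)\,0 = c]$. Then $(\lambda[x : \mathrm{nat}].\,x + x)\,0 \sim_\Gamma c$
and $(\lambda[x : \mathrm{nat}].\,x + x)\,0 \approx_\Gamma c$, using congruence and deduction of $\sim_\Gamma$ and $\approx_\Gamma$.

In contrast, $\beta$-reducing $(\lambda[x : \mathrm{nat}].\,x + x)\,0$ yields $0 + 0 \sim_\Gamma c$, but not $0 + 0 \approx_\Gamma c$.
Indeed, $(\lambda[x : \mathrm{nat}].\,x + x)\,0$ and $0 + 0$ are no longer $\approx_\Gamma$-convertible, a direct
consequence of removing $\beta\iota$-reduction from $\approx_\Gamma$: the equation $(\lambda[x : \mathrm{nat}].\,x)\,0 = c$ cannot
be used anymore, since $0$ is not $\approx_\Gamma$-convertible to $(\lambda[x : \mathrm{nat}].\,x)\,0$.

Now, normalizing all terms as well as the environment $\Gamma$, we can recover
convertibility for $\approx$: $0 + 0 \approx_{\Gamma\downarrow_{\beta\iota}} c$, the extractable equation of $\Gamma{\downarrow_{\beta\iota}}$ being now $0 = c$.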

As a consequence, we obtain: 

Theorem 2. $\sim_\Gamma$ is decidable for any environment $\Gamma$ when taking for $\mathcal{O}^+$ the set of
terms that are reducible to an algebraic term.

and therefore, our main result follows:

Theorem 3. The type-checking relationship $\Gamma \vdash t : T$ is decidable in CCIC.
3 Using CCIC

We give here a detailed example illustrating the advantages of CCIC, based on the 
inductive type of words introduced in Section lSTI 

In Coq. First, we give a development in Coq, therefore based on CIC. 

Variable T : Set.

Inductive word : nat -> Set :=
| epsilon : word 0
| char : T -> word 1
| append : forall n p, word n -> word p -> word (n+p).

Lemma plus_n_0_transparent : forall n, n+0=n.
Proof. induction n as [| n IHn]; simpl;
  [idtac | rewrite -> IHn]; trivial. Defined.

Lemma plus_n_Sm_transparent : forall n m, n+(S m)=S(n+m).
Proof. intros n m; induction n as [| n IHn];
  simpl; [idtac | rewrite -> IHn]; trivial. Defined.

Lemma plus_assoc_transparent : forall n p q, (n+p)+q=n+(p+q).
Proof. intros n p q; elim n; [trivial | intros k].
  simpl; intros H; rewrite -> H; trivial. Defined.

Definition reverse_acc : forall n, word n -> forall p, word p -> word (p+n).
Proof. intros n wn; induction wn as [ | c | n p wn IHwn wp IHwp];
  intros k wk. rewrite plus_n_0_transparent; exact wk.
  rewrite plus_n_Sm_transparent; rewrite plus_n_0_transparent;
  exact (append (char c) wk).
  rewrite <- plus_assoc_transparent; exact (IHwp _ (IHwn _ wk)). Defined.

Fixpoint reverse n (w : word n) {struct w} : word n :=
  match w in word k return word k with
  | epsilon => epsilon
  | char c => char c
  | append n1 n2 w1 w2 => reverse_acc w2 w1 end.

The example of palindromes as words satisfying the property word_eq m (reverse m)
is carried out in Strub's thesis (see his website). It yields a much more complex Coq
development than the above, since it involves the equality over (quotients) of words.

In CCIC. We now make the similar development in CCIC, using a self-explanatory 
syntax. The definition of reverse reduces then to: 

Fixpoint reverse n (w : word n) {struct w} : word n := match w with
  | epsilon => epsilon
  | char c => char c
  | append _ _ w1 w2 => append (reverse w2) (reverse w1) end.

Typing of the third clause of reverse will use here Presburger's arithmetic, since
append n1 n2 w1 w2 has type word (n1 + n2), while append n2 n1 w2 w1
has type word (n2 + n1), two types that are not convertible in CIC, but which
become convertible in CCIC. We can easily see with this example the immense
benefit brought by internalizing Presburger's arithmetic. Note that a single certificate is
generated for this conversion:

[n1 : nat, n2 : nat, w1 : word n1, w2 : word n2, n1 + n2, n2 + n1]
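
To make the contrast concrete, here is a minimal sketch, in plain Coq, of what the
third clause costs when word (n2 + n1) and word (n1 + n2) are not convertible: the
result must be transported along a proof of n2 + n1 = n1 + n2. The helper cast and
the use of the standard-library lemma Nat.add_comm are our own assumptions for
illustration and are not part of the development above; all arguments are passed
explicitly so that the block stands alone.

```coq
Require Import Arith.

Section Words.
  Variable T : Set.

  Inductive word : nat -> Set :=
  | epsilon : word 0
  | char : T -> word 1
  | append : forall n p, word n -> word p -> word (n + p).

  (* Hypothetical helper: transport a word along an equality of lengths. *)
  Definition cast (n m : nat) (e : n = m) (w : word n) : word m :=
    match e in (_ = k) return word k with
    | eq_refl => w
    end.

  (* Same shape as the CCIC definition, but the index mismatch
     n2 + n1 versus n1 + n2 must be bridged by an explicit cast. *)
  Fixpoint reverse (n : nat) (w : word n) {struct w} : word n :=
    match w in word k return word k with
    | epsilon => epsilon
    | char c => char c
    | append n1 n2 w1 w2 =>
        cast (n2 + n1) (n1 + n2) (Nat.add_comm n2 n1)
             (append n2 n1 (reverse n2 w2) (reverse n1 w1))
    end.
End Words.
```

In CCIC the cast disappears, since the decision procedure discharges
n2 + n1 = n1 + n2 inside conversion. Note also that Nat.add_comm is opaque in the
standard library, so this sketch does not compute on closed words; this is
presumably why the Coq development above proves its arithmetic lemmas with Defined.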
4 Conclusion

CCIC is an extension of CIC by arbitrary first-order decision procedures for equal- 
ity. We have shown here with a detailed example using Presburger's arithmetic the 
benefit of the approach with respect to the current implementation of Coq based on 
CIC: more terms can be typed especially in presence of types such as dependent 
lists which become easy to use; many proofs get automated, making the life of the 
user easier (developing the example of reverse for dependent lists in the currently
distributed version of Coq took us a day of work, and we don't believe this can
be shrunk to one hour); and proofs are much smaller, some seemingly complex
proofs becoming simple reflexivity proofs. We believe that the resulting style of 
proofs becomes much closer to that of the working mathematician. 

We have also explained the advantage of the approach insofar as it allows one to
separate computation clearly from deduction, therefore allowing for an incremental
development of the kernel of the system.

So far, we have considered only decidable equality theories. However, thanks
to the decidability assumption, a decidable non-equality theory can always be
transformed into a decidable equality theory over the type Bool of truth values equipped
with its usual operations.
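
As a small illustration of this transformation, in plain Coq and using only the
standard library (the lemma name le_as_bool_equation is ours, chosen for the
example), the decidable predicate n <= m can be rephrased as an equation over bool:

```coq
Require Import Arith.

(* The order predicate is equivalent to an equation over bool,
   using the standard boolean test Nat.leb and its specification Nat.leb_le. *)
Lemma le_as_bool_equation :
  forall n m : nat, n <= m <-> Nat.leb n m = true.
Proof.
  intros n m; split; intro H; apply Nat.leb_le; assumption.
Qed.
```

A decision procedure for equality over such boolean terms would then, under this
reading, also decide the original order predicate.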

There are still many directions to be investigated. A first is to embed membership 
equational logic in CIC along the lines of the simpler embedding described here. A 
second is to consider the case of dependent algebras instead of the simpler
parametric algebras. This is a much more difficult question, which requires using a stronger
notion of conversion in the main argument of an elimination, but would further help
us address other weaknesses of Coq.

Finally, we strongly believe that the use of decision procedures outputting
certificates when they succeed and explanations when they fail will change our way of
making formal proofs, and enlarge the audience of proof assistants.

Acknowledgement. We thank the Coq group for many useful discussions and 
suggestions, and the referees for their useful remarks. 